${\bf \large \cal Hovhannes \ Grigoryan}\\ {\rm \small New \ York, \ NY}$
Some observables can be computed either by sampling or by integration. In many instances Monte Carlo sampling is superior to integration, especially in higher dimensions. In this part we start with a trivial example of sampling points from the Gaussian distribution, by transforming the variables inside the Gaussian integral in such a way that the new variables are uniformly distributed. We then show how to sample points uniformly on a sphere and inside a ball, using either direct sampling or a Markov-chain algorithm; the latter allows us to implement a random walk on a sphere or on a circle. We discuss how the Metropolis acceptance probability in the Markov-chain method can be used to sample points from an arbitrary distribution, as well as competing direct-sampling algorithms. We also discuss the rejection-free tower sampling and Walker algorithms. We consider the example of the inverse-square-root distribution, discuss the problems associated with it, and how to solve them. We conclude by calculating the volume and area of the d-dimensional hypersphere using Monte Carlo methods [1].
Gaussian Distribution from Uniform Distribution
Direct way to sample points on a unit sphere
Sampling uniform distribution inside the unit ball
Metropolis Acceptance Probability
Direct Sampling Rejection Cut Algorithm
Tower Sampling Method (rejection free)
Walker Algorithm
Calculation of High Dimensional Integrals Using Monte Carlo Methods
[1] We closely follow the outline of lectures by W. Krauth, "Statistical Mechanics: Algorithms and Computations", Oxford University Press, 2006.
%%html
<style>
body, p, div.rendered_html {
color: black;
font-family: "Times New Roman", serif;
font-size: 12pt;
}
</style>
from __future__ import division
import random
import numpy as np
from itertools import combinations
%precision 20
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
from matplotlib import animation, rc, cm
from mpl_toolkits.mplot3d import axes3d, Axes3D
from IPython.display import HTML, Image, clear_output
plt.rcParams['figure.figsize'] = (14,8)
plt.style.use('fivethirtyeight')
rc('animation', html='html5')
import warnings
warnings.filterwarnings('ignore')
import sys, os
sys.path.append(os.path.abspath(os.path.join('..')))
from utils.make_gif import make_gif, clear_data
from utils.html_converter import html_converter
Consider the example of estimating $\pi$ from P1. We can do this either by sampling or by integration:

$$\frac{\pi}{4} = \int^1_0 \int^1_0 \text{d}x\, \text{d}y\, \pi(x,y)\, \theta(1 - x^2 - y^2),$$

where $\pi(x,y)=1$ is the uniform probability density inside the unit square and $\theta$ is the step function counting hits inside the quarter disk. It is clear that Monte Carlo sampling methods should excel in higher dimensions.
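The sampling estimate itself can be sketched in a few lines (a minimal illustration under the same setup as P1, not the P1 code itself):

```python
import random

def estimate_pi(n_trials, seed=0):
    """Estimate pi by direct sampling: throw points into the unit square
    and count the fraction that lands inside the quarter disk."""
    rng = random.Random(seed)
    n_hits = 0
    for _ in range(n_trials):
        x, y = rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)
        if x ** 2 + y ** 2 < 1.0:
            n_hits += 1
    return 4.0 * n_hits / n_trials

print(estimate_pi(100000))  # close to 3.14159...
```

The statistical error of this estimate shrinks only as $1/\sqrt{N}$, but, unlike grid integration, this rate is independent of the dimension.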
Consider the 1D Gaussian integral $I = \int^\infty_{-\infty}\frac{\text{d}x}{\sqrt{2\pi}}e^{-x^2/2}$, which is known to be 1. We calculate it as follows:
$I^2=\int^\infty_{-\infty}\frac{\text{d}y}{\sqrt{2\pi}} \int^\infty_{-\infty}\frac{\text{d}x}{\sqrt{2\pi}}e^{-(x^2+y^2)/2}=\int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^\infty_0 r e^{-r^2/2} \text{d}r =\int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^\infty_0 e^{-\psi} \text{d}\psi = \int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^1_0 \text{d}\Upsilon=1$,
where $x=r \cos\phi$, $y = r \sin \phi$, $\psi = r^2/2$, and $\Upsilon = e^{-\psi}$.
These variable transformations define a map between the uniformly distributed random variables $\Upsilon, \phi$ and the Gaussian-distributed random variables $x, y$: a change of variables in an integral applies equally to samples. Sampling $\Upsilon$ and $\phi$ uniformly and inverting the transformation, $x = \sqrt{-2\ln\Upsilon}\,\cos\phi$, $y = \sqrt{-2\ln\Upsilon}\,\sin\phi$, therefore yields Gaussian samples (the Box-Muller transform).
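As a quick sanity check of this transformation (a standalone sketch, independent of the notebook's `gauss_test` below), the resulting samples should have mean close to 0 and variance close to 1:

```python
import random
import numpy as np

def box_muller(n, seed=1):
    """Transform uniform (Upsilon, phi) pairs into n Gaussian samples."""
    rng = random.Random(seed)
    out = []
    for _ in range(n // 2 + 1):
        phi = rng.uniform(0.0, 2.0 * np.pi)
        Upsilon = rng.uniform(0.0, 1.0)
        r = np.sqrt(-2.0 * np.log(Upsilon))    # radius from the exponent variable
        out += [r * np.cos(phi), r * np.sin(phi)]
    return np.array(out[:n])

samples = box_muller(100000)
print(samples.mean(), samples.var())
```

Both printed moments should be close to the standard normal values (0 and 1), up to the usual $1/\sqrt{N}$ statistical error.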
def gauss_test(sigma):
phi = random.uniform(0.0, 2.0 * np.pi)
Upsilon = random.uniform(0.0, 1.0)
Psi = - np.log(Upsilon)
r = sigma * np.sqrt(2.0 * Psi)
x = r * np.cos(phi)
y = r * np.sin(phi)
return [x, y]
# exact distribution:
list_x = [i * 0.1 for i in range(-40, 40)]
list_y = [np.exp(- x ** 2 / 2.0) / (np.sqrt(2.0 * np.pi)) for x in list_x]
# sampled distribution:
n_sampled_pairs = 50000
data = []
for sample in range(n_sampled_pairs):
data += gauss_test(1.0)
# graphics output
fig = plt.figure(figsize=(12,8))
plt.plot(list_x, list_y, color='b', label='exact', lw=0.5)
plt.hist(data, bins=150, density=True, color='r', histtype='step', label='sampled');
plt.legend()
plt.title('Sampling of the gaussian distribution')
plt.xlabel('$x$', fontsize=14)
plt.ylabel('$\pi(x)$', fontsize=14)
plt.show()
# sampled distribution:
n_sampled_pairs = 50000
data = []
for sample in range(n_sampled_pairs):
data.append(gauss_test(1.0))
X = np.random.randn(n_sampled_pairs)
Y = np.random.randn(n_sampled_pairs)
fig = plt.figure(figsize=(8,8))
plt.scatter(X, Y, color='b', label='exact', lw=0.3, alpha=0.7, marker='x') # by 'exact' we mean generated by a better algorithm
plt.scatter(np.array(data)[:,0], np.array(data)[:,1], color='r', label='sampled', lw=0.3, alpha=0.7, marker='x')
plt.legend()
plt.title('Sampling of the gaussian distribution')
plt.axis('scaled')
plt.xlabel('$x$', fontsize=14)
plt.ylabel('$y$', fontsize=14)
plt.show()
As was shown above, change of variables in the integral affects the distribution of samples. Therefore,
$1 = I^d= \left(\frac{1}{\sqrt{2\pi}}\right)^d \int^{\infty}_{-\infty} \dots \int^{\infty}_{-\infty}\text{d}x_0 \dots \text{d}x_{d-1} e^{-(x_0^2+ \dots + x^2_{d-1})/2} = \left(\frac{1}{\sqrt{2\pi}}\right)^d \int^\infty_{0} \text{d}r~ r^{d-1} e^{-r^2/2} \int \text{d}\Omega $
where we went from cartesian $(x_0, \dots, x_{d-1})$ to polar coordinates $(r, \Omega)$. Since angles in $\Omega$ are uniformly distributed, we can implement the following algorithm to uniformly sample the surface of the sphere.
%%time
x_list, y_list, z_list = [], [], []
nsamples = 6000
for sample in range(nsamples):
x, y, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
radius = np.sqrt(x ** 2 + y ** 2 + z ** 2)
x_list.append(x / radius)
y_list.append(y / radius)
z_list.append(z / radius)
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()
A vectorized implementation of the same algorithm.
%%time
nsamples = 6000
xyz = np.random.randn(nsamples, 3)
lens = np.linalg.norm(xyz, axis=1)
xyz_normed = xyz/lens.reshape(-1,1)
x_list, y_list, z_list = xyz_normed[:, 0], xyz_normed[:, 1], xyz_normed[:, 2]
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()
d = 3 # dimension
delta = 0.2 # step size
sigma = 1.0/np.sqrt(d)
n_samples = 1000
data = []
x = np.random.randn(d)
x /= np.linalg.norm(x)
for _ in range(n_samples):
    x = x.copy()  # copy so that earlier samples stored in data are not overwritten in place
eps = sigma * np.random.randn(d)
P = np.dot(eps, x)
eps -= P * x
eps /= np.linalg.norm(eps)
Upsilon = random.uniform(-delta, delta)
x += Upsilon * eps
x /= np.linalg.norm(x)
data.append(x)
adata = np.array(data)
x_list, y_list, z_list = adata[:, 0], adata[:, 1], adata[:, 2]
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()
def plot3D_markov_sphere(old_data, new_point, step, save=False):
fig = plt.figure(figsize=(10, 10))
grid = plt.GridSpec(8, 4, hspace=0.3, wspace=0.1)
# set up the axes for the first plot
ax = fig.add_subplot(grid[:-1, 1:], projection='3d')
# draw sphere
u, v = np.mgrid[0:2*np.pi:200j, 0:np.pi:100j]
x = np.cos(u)*np.sin(v)
y = np.sin(u)*np.sin(v)
z = np.cos(v)
ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap=cm.coolwarm,
linewidth=0, antialiased=False, alpha=0.2)
    # keep past points (unpack the history passed in as old_data)
    X, Y, Z = old_data
    xa, ya, za = np.array(X), np.array(Y), np.array(Z)
ax.plot(xa, ya, za, 'k.', alpha=0.4)
# draw new point
    x_n, y_n, z_n = new_point
ax.plot([x_n], [y_n], [z_n], 'ro')
plt.title('Random walk on the sphere t='+step, fontsize=18)
ax.set_xlabel('$x$', fontsize=18)
ax.set_ylabel('$y$', fontsize=18)
ax.set_zlabel('$z$', fontsize=18)
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
ax.set_zlim(-1, 1)
ax.set_aspect('equal')
theta = np.arccos(za/np.sqrt(xa**2 + ya**2 + za**2))
phi = np.arctan(ya/xa)
tt = np.linspace(0, np.pi, 100)
dist_theta = 0.5*np.sin(tt)
ax2 = fig.add_subplot(grid[:-1, 0])
ax2.hist(theta, bins=20, density=True, histtype='stepfilled', orientation='horizontal')
ax2.plot(dist_theta, tt, 'k--', label="$0.5 \sin(\\theta)$")
ax2.set_ylim(0, np.pi)
ax2.legend(loc="upper right")
ax2.invert_yaxis()
plt.title('Polar angle ($\\theta$)', fontsize=18)
ax3 = fig.add_subplot(grid[-1, 1:])
ax3.hist(phi, bins=20, density=True, histtype='stepfilled', orientation='vertical')
ax3.set_xlim(-np.pi/2, np.pi/2)
ax3.invert_xaxis()
plt.title('Azimuthal angle ($\phi$)', fontsize=18)
if save:
plt.savefig('../data/random_walk_sphere_t='+step+'.png', transparent=True)
plt.show()
def plot2D_markov_sphere(old_data, new_point, step, save=False):
plt.rcParams['figure.figsize'] = (18, 10)
# set up a figure twice as wide as it is tall
fig = plt.figure(figsize=plt.figaspect(0.5))
# set up the axes for the first plot
ax = fig.add_subplot(1, 2, 1)
# keep past points
X, Y = old_data
xa, ya = np.array(X), np.array(Y)
ax.plot(xa, ya, 'k.', alpha=0.4)
# draw new point
x_n, y_n = new_point
ax.plot([x_n], [y_n], 'ro')
plt.title('Random walk on the sphere t='+step, fontsize=18)
ax.set_xlabel('$x$', fontsize=18)
ax.set_ylabel('$y$', fontsize=18)
ax.set_xlim(-1.2, 1.2)
ax.set_ylim(-1.2, 1.2)
ax2 = fig.add_subplot(1, 2, 2)
phi = np.arctan(ya/xa)
ax2.hist(phi, bins=20, density=True)
plt.title('Azimuthal angle ($\phi$) at t='+step, fontsize=18)
if save:
plt.savefig('../data/random_walk_sphere_t='+step+'.png', transparent=True)
plt.show()
def markov_sphere_move(x, delta=0.2, d=3):
sigma = 1.0/np.sqrt(d)
eps = sigma * np.random.randn(d)
P = np.dot(eps, x)
eps -= P * x
eps /= np.linalg.norm(eps)
u = random.uniform(-delta, delta)
x += u * eps
x /= np.linalg.norm(x)
return x
tmax = 100
maxlength = len(str(tmax - 1))
delta = 0.5
d = 3
x = np.random.randn(d)
x_init = x/np.linalg.norm(x)
X, Y, Z = [], [], []
for t in range(tmax):
number_string = str(t).zfill(maxlength)
x_new = markov_sphere_move(x_init, delta, d)
x_init = x_new
X.append(x_new[0])
Y.append(x_new[1])
Z.append(x_new[2])
if t % 10 == 0:
clear_output(wait=True)
plot3D_markov_sphere((X, Y, Z), x_new, number_string, save=True)
make_gif('random_walk_sphere_t', 0.0)
%%html
<div style="display: flex; justify-content: row;">
<img src='data/random_walk_sphere_t.gif'>
</div>
tmax = 500
maxlength = len(str(tmax - 1))
delta = 1/np.sqrt(7.0)
d = 2
x = np.random.randn(d)
x_init = x/np.linalg.norm(x)
X, Y = [], []
for t in range(tmax):
number_string = str(t).zfill(maxlength)
x_new = markov_sphere_move(x_init, delta, d)
x_init = x_new
X.append(x_new[0])
Y.append(x_new[1])
clear_output(wait=True)
plot2D_markov_sphere((X, Y), x_new, number_string)
To uniformly sample the inside of a ball, we need to sample the radial coordinate from the distribution:

$$ \pi(r) = d\, r^{d-1}, \ \ \ 0 < r < 1 $$

We can obtain such samples as $u^{1/d}$ with $u \sim \text{Uniform}(0,1)$, using the universality of the uniform theorem, which states the following: if $F$ is a CDF with inverse $F^{-1}$ and $u \sim \text{Uniform}(0,1)$, then $F^{-1}(u)$ is distributed with CDF $F$.

Indeed, the CDF corresponding to $\pi(r)$ is:

$$ F_R(r) = r^{d} \ \implies F_R^{-1}(u) = u^{1/d} \sim \pi(R) $$

x_list, y_list, z_list = [], [], []
nsamples = 6000
for sample in range(nsamples):
x, y, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
length = random.uniform(0.0, 1.0) ** (1.0 / 3.0) / np.sqrt(x ** 2 + y ** 2 + z ** 2)
x, y, z = x * length, y * length, z * length
x_list.append(x)
y_list.append(y)
z_list.append(z)
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')
ax.set_aspect('equal')
plt.title('Uniform sampling inside the sphere')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.plot(x_list, y_list, z_list, '.')
plt.show()
The essence of the Metropolis acceptance probability: perform a uniform random move, and accept the new sample with probability

$$p(a\to b) = \min(1, \pi(b)/\pi(a)),$$

where $a$ is the current state, $b$ is the proposed state, and $\pi$ is the target probability distribution.
Computationally, to generate a sample from the distribution $\pi(x)$:
1. Start from the current state $a$, and randomly (uniformly) propose a move to a state $b$.
2. Generate a uniformly distributed random variable $\Upsilon \sim \text{Uniform}([0, 1])$.
3. If $\Upsilon < \pi(b)/\pi(a)$, accept the move to the new state $b$; otherwise reject it and keep $a$.
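The steps above can be wrapped into a generic sampler for an arbitrary, possibly unnormalized density. This is a sketch with our own names (`metropolis`, `pi`, `delta`), not library code:

```python
import random
import math

def metropolis(pi, x0, n_samples, delta=0.5, seed=2):
    """Markov-chain sampling with the Metropolis acceptance probability.
    pi may be unnormalized: only the ratio pi(b) / pi(a) is ever used."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_samples):
        x_new = x + rng.uniform(-delta, delta)       # propose a -> b uniformly
        if rng.uniform(0.0, 1.0) < pi(x_new) / pi(x):
            x = x_new                                # accept with min(1, pi(b)/pi(a))
        chain.append(x)                              # on rejection the old state repeats
    return chain

# Concrete instance: the standard gaussian, normalization constant omitted.
chain = metropolis(lambda x: math.exp(-x * x / 2.0), 0.0, 50000)
```

Note that comparing $\Upsilon$ directly to the ratio handles the $\min(1, \cdot)$ automatically, since $\Upsilon \le 1$ always accepts when the ratio exceeds 1.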
x = 0.0
delta = 0.5
data = []
N = 50000
for k in range(N):
x_new = x + random.uniform(-delta, delta)
if random.uniform(0.0, 1.0) < np.exp(- x_new ** 2 / 2.0) / np.exp(- x ** 2 / 2.0):
x = x_new
data.append(x)
t = np.linspace(-3.5, 3.5, N)
y = np.exp(- t ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
fig = plt.figure(figsize=(12,8))
plt.hist(data, 100, density=True, label="generated samples")
plt.plot(t, y, c='red', linewidth=2.0, label="$e^{-x^2/2}/\sqrt{2\pi}$")
plt.title('Theoretical Gaussian distribution $\pi(x)$ and \
\nnormalized histogram for '+str(len(data))+' samples', fontsize = 18)
plt.xlabel('$x$', fontsize = 18)
plt.ylabel('$\pi(x)$', fontsize = 18)
plt.legend(loc="upper right")
plt.show()
Drop points uniformly into a rectangle that frames the target distribution, and accept the $x$ coordinate whenever the point falls below the curve.
N.B. The rejections here belong to a direct sampling algorithm, not to a Markov chain.
y_max = 1.0 / np.sqrt(2.0 * np.pi)
x_cut = 3.5
n_data = 10000
n_accept = 0
data = []
X, Y = [], []
while n_accept < n_data:
y = random.uniform(0.0, y_max)
x = random.uniform(-x_cut, x_cut)
if y < np.exp( - x **2 / 2.0) / np.sqrt(2.0 * np.pi):
n_accept += 1
data.append(x)
X.append(x)
Y.append(y)
t = np.linspace(-3.5, 3.5, N)
yt = np.exp(- t ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
fig = plt.figure(figsize=(12,8))
plt.hist(data, 100, density=True, label="x histogram of samples", alpha=0.8)
plt.scatter(X, Y, color='black', marker='.', linewidths=0.1, alpha=0.6, label="(x,y) point samples")
plt.plot(t, yt, c='red', linewidth=2.0, label="$e^{-x^2/2}/\sqrt{2\pi}$")
plt.title('Theoretical Gaussian distribution $\pi(x)$ and \
\nnormalized histogram for '+str(len(data))+' samples', fontsize = 18)
plt.xlabel('$x$', fontsize = 18)
plt.ylabel('$\pi(x)$', fontsize = 18)
plt.xlim([-x_cut, x_cut])
plt.ylim([0, y_max])
plt.legend(loc="upper right")
plt.show()
This method fails, e.g., for the inverse-square-root distribution $\pi(x) = 1/(2\sqrt{x})$, because the density diverges at $x = 0$ and no finite box height frames it: any cutoff $y_{max}$ truncates the peak, while larger boxes make the rejection rate very large, rendering the algorithm inefficient.
y_max = 100.0
x_cut = 1.0
n_data = 10000
data = []
n_accept = 0
while n_accept < n_data:
y = random.uniform(0.0, y_max)
x = random.uniform(0.0, x_cut)
if y < 1.0 / (2.0 * np.sqrt(x)):
n_accept += 1
data.append(x)
plt.hist(data, bins=100, density=True, label="samples")
t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))
plt.plot(t, yt, 'red', linewidth = 2, label="$1/(2\sqrt{x})$")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
\n histogram for '+str(n_accept)+' accepted samples',fontsize=16)
plt.xlabel('$x$', fontsize=18)
plt.ylabel('$\pi(x)$', fontsize=18)
plt.legend(loc="upper right")
plt.show()
In this respect the Markov-chain algorithm is better: it has no problems with diverging probability densities, and it does not need any box size. In the algorithm below, we perform Markov-chain sampling with the Metropolis acceptance probability, and record the maximal value of $y$ and the minimal value of $x$ reached. We observe that the algorithm moves to very small values of $x$ very rapidly.
x = 0.2
delta = 0.5
data = []
y_max = 0
n_trials = 100000
for k in range(n_trials):
x_new = x + random.uniform(-delta, delta)
if 0.0 < x_new < 1.0:
if random.uniform(0.0, 1.0) < np.sqrt(x) / np.sqrt(x_new):
x = x_new
if 0.5 / np.sqrt(x) > y_max:
y_max = 0.5 / np.sqrt(x)
print(y_max, x, k)
data.append(x)
t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))
fig = plt.figure(figsize=(12,8))
plt.hist(data, bins=100, density=True, label="samples")
plt.xlabel('$x$', fontsize=16)
plt.ylabel('$\pi(x)$', fontsize=16)
plt.plot(t, yt, linewidth=1.5, color='r', label="exact")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
\n histogram for ' + str(len(data)) + ' samples', fontsize=16)
plt.legend(loc="upper right")
plt.show()
Assume that the probabilities of the activities $(a, b, c, d)$ are $\pi_a, \pi_b, \pi_c, \pi_d$. We can use the direct-sampling "cut" algorithm discussed above to determine which activity to select: throw a point uniformly into a rectangle that contains the probability bars, and select the activity if the point lands below its bar. However, this method may have a high rejection rate. That is why it is more efficient to use the rejection-free tower sampling algorithm: we stack the probabilities on top of each other, like a tower, throw a point uniformly into the tower, and the segment it lands in determines the activity.
def bisection_search(eta, w_cumulative):
"""Bisection search to find the bin corresponding to eta."""
kmin = 0
kmax = len(w_cumulative)
while True:
k = int((kmin + kmax) / 2)
if w_cumulative[k] < eta:
kmin = k
elif w_cumulative[k - 1] > eta:
kmax = k
else:
return k - 1
def tower_sample(weights):
"""Sample an integer number according to weights."""
sum_w = sum(weights)
w_cumulative = [0.0]
for l in range(len(weights)):
w_cumulative.append(w_cumulative[l] + weights[l])
eta = random.random() * sum_w
sampled_choice = bisection_search(eta, w_cumulative)
return sampled_choice
weights = [0.4, 0.3, 0.8, 0.1, 0.2]
n_samples = 20
for sample in range(n_samples):
print(tower_sample(weights), end="--")
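An equivalent vectorized implementation of tower sampling (our own sketch, not a different algorithm): NumPy's `cumsum` builds the tower and `searchsorted` performs the bisection search in one call:

```python
import numpy as np

def tower_sample_np(weights, n_samples, seed=3):
    """Tower (cumulative) sampling: searchsorted does the bisection search."""
    rng = np.random.default_rng(seed)
    w_cumulative = np.cumsum(weights)                # the "tower" of stacked weights
    eta = rng.uniform(0.0, w_cumulative[-1], size=n_samples)
    return np.searchsorted(w_cumulative, eta, side='right')

weights = [0.4, 0.3, 0.8, 0.1, 0.2]
counts = np.bincount(tower_sample_np(weights, 100000), minlength=len(weights))
print(counts / 100000)  # close to weights normalized by their sum
```

The empirical frequencies should approach `weights / sum(weights)` as the number of samples grows.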
Although the tower sampling algorithm is rejection-free, it is not optimal: locating the segment costs $O(\log N)$ per sample via bisection. That is why we will consider the Walker algorithm. Represent each probability as a box of unit width, with height equal to the probability. The algorithm goes as follows:
1. First compute the average probability.
2. Put the largest box on top of the smallest box, and return the part sticking out above the average line back to its place.
3. Continue this procedure until all stacks have exactly the average height, with at most two boxes in each stack.
4. Sample a point uniformly into these stacks to determine the activity.
Note, however, that the Walker algorithm applies only to discrete distributions, for which it is optimal: after the table is built, each sample costs $O(1)$.
# There are N=5 options with probability pi(a)=[prob, number]
N = 5
pi = [[1.1 / 5.0, 0], [1.9 / 5.0, 1], [0.5 / 5.0, 2], [1.25 / 5.0, 3], [0.25 / 5.0, 4]]
x_val = [a[1] for a in pi]
y_val = [a[0] for a in pi]
# average probability
pi_mean = sum(y_val) / float(N)
long_s = []
short_s = []
for p in pi:
if p[0] > pi_mean:
long_s.append(p)
else:
short_s.append(p)
table = []
for k in range(N - 1):
e_plus = long_s.pop()
e_minus = short_s.pop()
table.append((e_minus[0], e_minus[1], e_plus[1]))
e_plus[0] = e_plus[0] - (pi_mean - e_minus[0])
if e_plus[0] < pi_mean:
short_s.append(e_plus)
else:
long_s.append(e_plus)
if long_s:
table.append((long_s[0][0], long_s[0][1], long_s[0][1]))
else:
table.append((short_s[0][0], short_s[0][1], short_s[0][1]))
print(table)
samples = []
n_samples = 10000
for k in range(n_samples):
Upsilon = random.uniform(0.0, pi_mean)
i = random.randint(0, N - 1)
if Upsilon < table[i][0]:
samples.append(table[i][1])
else:
samples.append(table[i][2])
plt.figure()
plt.hist(samples, bins=N, range=(-0.5, N - 0.5), density=True)
plt.plot(x_val, y_val, 'ro', ms=8)
plt.title("Histogram using Walker's method for a discrete distribution\n\
on $N=$" + str(N) + " choices (" + str(n_samples) + " samples)", fontsize=18)
plt.xlabel('$k$', fontsize=20)
plt.ylabel('$\pi(k)$', fontsize=20)
plt.show()
import scipy.special
n_trials = 100000
data = []
for trial in range(n_trials):
Upsilon = random.uniform(0.0, 1.0)
x = np.sqrt(2.0) * scipy.special.erfinv(2.0 * Upsilon - 1.0)
data.append(x)
t = np.linspace(-4, 4, n_trials)
plt.plot(t, np.exp( - t **2 / 2.0) / np.sqrt(2.0 * np.pi))
h = plt.hist(data, bins=100, range=(-4, 4), density=True)
plt.title("Generating standard normal distribution from uniform using inverse CDF", fontsize=18)
plt.show()
gamma = -0.5  # pi(x) = (gamma + 1) * x**gamma = 1/(2 sqrt(x)); inverse CDF gives x = u**(1/(gamma+1))
n_trials = 10000
data = []
for _ in range(n_trials):
u = random.uniform(0, 1)
x = u ** (1.0/(gamma + 1.0))
data.append(x)
t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))
fig = plt.figure(figsize=(12,8))
plt.hist(data, bins=100, density=True, label="samples")
plt.xlabel('$x$', fontsize=16)
plt.ylabel('$\pi(x)$', fontsize=16)
plt.plot(t, yt, linewidth=1.5, color='r', label="exact")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
\n histogram for ' + str(len(data)) + ' samples', fontsize=16)
plt.legend(loc="upper right")
plt.show()
To build up the volume of the $d$-dimensional unit sphere recursively, we use the ratio of its volume to that of the enclosing cylinder (height 2, with the $(d-1)$-dimensional unit sphere as its base):

$Q(d) \equiv 2V_{d, sph}(1)/V_{d, cyl}(1) = V_{d, sph}(1)/V_{d-1, sph}(1) = \sqrt{\pi} \frac{\Gamma\left(\frac{d+1}{2}\right)}{\Gamma\left(\frac{d}{2}+1\right)}$
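As a consistency check, the last equality can be verified numerically against the closed-form volume $V_{d,sph}(1) = \pi^{d/2}/\Gamma(d/2+1)$ (the helper names `V_sph_exact` and `Q_exact` are our own):

```python
from math import pi, gamma, sqrt

def V_sph_exact(d):
    """Closed-form volume of the d-dimensional unit sphere."""
    return pi ** (d / 2.0) / gamma(d / 2.0 + 1.0)

def Q_exact(d):
    """Q(d) = sqrt(pi) * Gamma((d + 1)/2) / Gamma(d/2 + 1)."""
    return sqrt(pi) * gamma((d + 1) / 2.0) / gamma(d / 2.0 + 1.0)

# The ratio of consecutive sphere volumes reproduces the Gamma-function formula.
for d in range(2, 10):
    assert abs(Q_exact(d) - V_sph_exact(d) / V_sph_exact(d - 1)) < 1e-12

print(Q_exact(3))  # = V_3 / V_2 = (4 pi / 3) / pi = 4/3
```

This recursion, $V_d = Q(d)\, V_{d-1}$, is exactly what the Monte Carlo estimates below exploit.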
# samples points (x, y) inside the unit disk, rather than inside the square
# at each step, sample a new value of z as random.uniform(-1.0, 1.0), and count as a "hit" if x^2 + y^2 + z^2 < 1.0.
x, y = 0.0, 0.0 # initial position
delta = 0.5 # step size
N_R = 400000 # number of steps
N_C = 0 # number of times we land inside the 3D sphere
for _ in range(N_R):
z = random.uniform(-1.0, 1.0)
dx, dy = random.uniform(-delta, delta), random.uniform(-delta, delta)
if (x + dx)**2 + (y + dy)**2 < 1.0:
x, y = x + dx, y + dy
if x ** 2 + y ** 2 + z ** 2 < 1.0:
N_C += 1
Q_3 = 2*N_C/N_R
print("<Q(3)> is:", Q_3)
# sample points (x,y,z) inside the 3D unit sphere.
# at each step, sample a w as random.uniform(-1.0, 1.0), and count as a "hit" if (x^2 + y^2 + z^2 + w^2) < 1.
x, y, z = 0.0, 0.0, 0.0 # initial position
delta = 0.5 # step size
N_R = 400000 # number of steps
N_C = 0 # number of times we land inside the 4D sphere
for _ in range(N_R):
w = random.uniform(-1.0, 1.0)
dx, dy, dz = random.uniform(-delta, delta), random.uniform(-delta, delta), random.uniform(-delta, delta)
if (x + dx)**2 + (y + dy)**2 + (z + dz)**2 < 1.0:
x, y, z = x + dx, y + dy, z + dz
if x ** 2 + y ** 2 + z ** 2 + w ** 2 < 1.0:
N_C += 1
Q_4 = 2*N_C/N_R
V_3 = 4*np.pi/3.0
V_4 = V_3 * Q_4
print("Estimated <Q(4)> =", Q_4, ", exact Q(4) =", (np.pi**2/2)/V_3)
print("Estimated <V(4)> =", V_4, ", exact V(4) =", np.pi**2/2)
def rand_uniform_sphere(d, N):
""" Uniformly samples points inside d-D unit sphere.
At each iteration we modify only one component.
Returns: array of N samples (rows) and d columns
"""
x = [0.0]*d
old_radius_square = 0.0
delta = 0.5
samples = []
rads = []
for _ in range(N):
k = random.randint(0, d - 1)
step = random.uniform(-delta, delta)
x_old_k = x[k]
x_new_k = x_old_k + step
new_radius_square = old_radius_square + x_new_k ** 2 - x_old_k ** 2
if new_radius_square < 1.0:
x[k] = x_new_k
old_radius_square = new_radius_square
samples += x # appending lists in iteration causes problems, this is why we extend list
rads.append(np.sqrt(old_radius_square))
return np.array(samples).reshape(-1, d), rads
def Q(d, N):
D = d-1
x = [0.0]*D
old_radius_square = 0.0
delta = 0.5
samples = []
N_C = 0
for _ in range(N):
z = random.uniform(-1, 1)
k = random.randint(0, D - 1)
step = random.uniform(-delta, delta)
x_old_k = x[k]
x_new_k = x_old_k + step
new_radius_square = old_radius_square + x_new_k ** 2 - x_old_k ** 2
if new_radius_square < 1.0:
x[k] = x_new_k
old_radius_square = new_radius_square
if old_radius_square + z**2 < 1:
N_C += 1
return 2*N_C/float(N)
import math

def vol_sphere_exact(d):
    return np.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
def vol_sphere(d, N):
V_2 = np.pi
if d == 2:
return V_2
    elif d < 2:
        raise ValueError("Number of dimensions must be at least 2!")
else:
V_d = V_2
for i in range(3, d+1):
V_d *= Q(i, N)
V_d_exact = vol_sphere_exact(d)
error = round(100*(V_d - V_d_exact)/V_d_exact, 4)
print(f"For d = {d}, estimated <V_d> = {V_d}, exact V_d = {V_d_exact}, error = {error}%")
return V_d
V_d = vol_sphere(200, 100000)
diffs = []
Ns = [10**k for k in range(2, 7)]
for N in Ns:
V_d = vol_sphere(20, N)
V_d_exact = vol_sphere_exact(20)
diffs.append(V_d_exact - V_d)
samples, rads = rand_uniform_sphere(2, 10000)
h = plt.hist(rads, bins=100, density=True)
s = plt.scatter(samples[:, 0], samples[:, 1])
plt.axis('scaled')
N = 1000000
samples, rads = rand_uniform_sphere(20, N)
t = np.linspace(0, 1, N)
h = plt.hist(rads, bins=50, density=True)
plt.plot(t, 20*t**19)
volume, dimension = [], []
import math

def V_sph(dim):
    return np.pi ** (dim / 2.0) / math.gamma(dim / 2.0 + 1.0)
for d in range(0,20):
dimension.append(d)
volume.append(V_sph(d))
plt.plot(dimension, volume, 'bo-')
plt.title('Volume of the unit hypersphere in $d$ dimensions')
plt.xlabel('$d$', fontsize = 15)
plt.ylabel('$V_{sph}(d)$', fontsize = 15)
plt.xlim(0,20)
plt.ylim(0,6)
for i in range(0,20):
plt.annotate(round(volume[i],2), (dimension[i], volume[i]), xytext=(10, 0), ha='left',
textcoords='offset points')
plt.grid(True)
notebook_file = r"P3_Sampling_and_Integration.ipynb"
html_converter(notebook_file)